Vote Aggregation as a Clustering Problem
Abstract
Gathering noisy labels from crowds of non-experts is an important way to build large training sets. Eliciting such labels matters in tasks such as judging web search quality and rating products. We propose a method for aggregating noisy labels collected from a crowd of workers or annotators. The method assumes that labels are generated by a probability distribution over items and labels. We formulate it by drawing parallels between Gaussian Mixture Models (GMMs) and Restricted Boltzmann Machines (RBMs), and show that vote aggregation can be viewed as a clustering problem. We use K-RBMs to perform the clustering. Finally, we report empirical evaluations on real datasets.
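The clustering view of vote aggregation can be illustrated with a small sketch. This is not the K-RBM method of the abstract: it swaps in plain 2-means (the hard-assignment limit of a GMM) on binary votes, and the function name `cluster_votes`, the seeding rule, and the toy vote matrix are all assumptions made for illustration. Each item is represented by the vector of votes it received, and the two clusters are read as the two aggregated labels.

```python
from typing import List, Optional


def cluster_votes(votes: List[List[int]], max_iters: int = 20) -> List[int]:
    """Aggregate binary votes (rows = items, columns = workers) with 2-means.

    Illustrative stand-in for a clustering-based aggregator, not K-RBM.
    Assumes a non-empty matrix with one 0/1 vote per worker per item.
    """
    def dist2(a, b):
        # Squared Euclidean distance between a vote vector and a centroid.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Deterministic seeding (an assumption): the rows with the fewest
    # and the most positive votes become the two initial centroids.
    by_sum = sorted(votes, key=sum)
    centroids = [[float(x) for x in by_sum[0]], [float(x) for x in by_sum[-1]]]

    assign: Optional[List[int]] = None
    for _ in range(max_iters):
        # Assignment step: each item joins its nearest centroid.
        new_assign = [
            0 if dist2(v, centroids[0]) <= dist2(v, centroids[1]) else 1
            for v in votes
        ]
        if new_assign == assign:
            break  # assignments stopped changing
        assign = new_assign
        # Update step: each centroid becomes the mean of its members.
        for k in (0, 1):
            members = [v for v, a in zip(votes, assign) if a == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]

    # Name the cluster with the more positive centroid "label 1"; this
    # assumes workers are, on average, better than chance.
    positive = 1 if sum(centroids[1]) >= sum(centroids[0]) else 0
    return [1 if a == positive else 0 for a in assign]


# Toy data (hypothetical): 6 items, 5 workers, each row has one dissenting vote.
votes = [
    [1, 1, 1, 0, 1],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
    [1, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
]
labels = cluster_votes(votes)
print(labels)
```

On this toy matrix the clusters coincide with a simple majority vote. A mixture model or an RBM-based clusterer would replace the hard nearest-centroid assignment with a probabilistic one.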
Similar resources
Robust Method of Vote Aggregation and Proposition Verification for Invariant Local Features
This paper presents a method for analyzing the vote space created by the local feature extraction process in a multi-detection system. The method contrasts with the classic clustering approach and gives a high level of control over cluster composition for further verification steps. The proposed method comprises the graphical vote space presentation, the proposition generation, the two-...
EIDA: An Energy-Intrusion aware Data Aggregation Technique for Wireless Sensor Networks
Energy consumption is considered a critical issue in wireless sensor networks (WSNs). The batteries of sensor nodes have a limited power supply, which in turn limits the services and applications they can support. An efficient solution for improving energy consumption and evening out traffic in WSNs is Data Aggregation (DA), which can reduce the number of transmissions. Two main challenges for DA are: (i)...
Online Aggregation of Coherent Generators Based on Electrical Parameters of Synchronous Generators
This paper proposes a novel approach for online clustering of coherent generators in a large power system following a wide area disturbance. An interconnected power system may become unstable due to a severe contingency when it is operated close to the stability boundaries. Hence, controlled islanding of the bulk power system is the last resort to prevent catastrophic cascading outages and wide area bl...
A New Method for Duplicate Detection Using Hierarchical Clustering of Records
Accuracy and validity of data are prerequisites for the proper operation of any software system. There is always a possibility of errors occurring in data due to human and system faults. One such error is the existence of duplicate records in data sources. Duplicate records refer to the same real-world entity. Only one of them should exist in a data source, but for some reasons like aggregation of ...
Generalized Cluster Aggregation
Clustering aggregation has emerged as an important extension of the classical clustering problem. It refers to the situation in which a number of different (input) clusterings have been obtained for a particular data set and it is desired to aggregate those clustering results to get a better clustering solution. In this paper, we propose a unified framework to solve the clustering aggregation p...